Overview

Dataset statistics

Number of variables: 16
Number of observations: 48895
Missing cells: 20141
Missing cells (%): 2.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.0 MiB
Average record size in memory: 128.0 B
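
The percentage and size figures above can be cross-checked from the raw counts. The sketch below is only an arithmetic check, assuming that "Missing cells (%)" is taken over all cells (variables × observations) and that 1 MiB is 2**20 bytes. The variable names are illustrative.

```python
# Figures copied from the dataset statistics above.
n_variables = 16
n_observations = 48895
missing_cells = 20141
avg_record_bytes = 128.0

# Share of missing cells over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total memory implied by the average record size, in MiB (assumed 2**20 bytes).
total_mib = avg_record_bytes * n_observations / 2**20

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Both values round to the figures reported in the overview.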

Variable types

Numeric: 10
Categorical: 6

Warnings

name has a high cardinality: 47905 distinct values
host_name has a high cardinality: 11452 distinct values
neighbourhood has a high cardinality: 221 distinct values
last_review has a high cardinality: 1764 distinct values
last_review has 10052 (20.6%) missing values
reviews_per_month has 10052 (20.6%) missing values
minimum_nights is highly skewed (γ1 = 21.82727453)
name is uniformly distributed
id has unique values
number_of_reviews has 10052 (20.6%) zeros
availability_365 has 17533 (35.9%) zeros
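
As a rough illustration of how threshold-based warnings of this kind can be produced, the sketch below flags columns from the distinct counts and skewness values listed in the variable sections of this report. The two thresholds are hypothetical values chosen only because they are consistent with the warnings listed here. They are not read from the report's configuration.

```python
# Hypothetical thresholds (assumptions, not taken from the report's config.yaml).
CARDINALITY_THRESHOLD = 50
SKEWNESS_THRESHOLD = 20.0

# Distinct counts and skewness values copied from the variable sections.
distinct_counts = {
    "name": 47905,
    "host_name": 11452,
    "neighbourhood": 221,
    "last_review": 1764,
    "room_type": 3,
}
skewness = {
    "price": 19.118939,
    "minimum_nights": 21.82727453,
    "number_of_reviews": 3.690634572,
}

# A column is flagged when its statistic exceeds the assumed threshold.
high_cardinality = sorted(
    col for col, n in distinct_counts.items() if n > CARDINALITY_THRESHOLD
)
skewed = sorted(
    col for col, g in skewness.items() if abs(g) > SKEWNESS_THRESHOLD
)

print("High cardinality:", high_cardinality)
print("Skewed:", skewed)
```

With these assumed thresholds, room_type (3 distinct values) and price (skewness 19.1) stay unflagged, which matches the warning list.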

Reproduction

Analysis started: 2021-02-09 14:46:23.551495
Analysis finished: 2021-02-09 14:47:11.583530
Duration: 48.03 seconds
Software version: pandas-profiling v2.10.1
Download configuration: config.yaml
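
The reported duration follows from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2021-02-09 14:46:23.551495")
finished = datetime.fromisoformat("2021-02-09 14:47:11.583530")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```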

Variables

id
Real number (ℝ≥0)

UNIQUE

Distinct: 48895
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19017143.24
Minimum: 2539
Maximum: 36487245
Zeros: 0
Zeros (%): 0.0%
Memory size: 382.1 KiB

Quantile statistics

Minimum: 2539
5-th percentile: 1222382.7
Q1: 9471945
median: 19677284
Q3: 29152178.5
95-th percentile: 35259101.2
Maximum: 36487245
Range: 36484706
Interquartile range (IQR): 19680233.5
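
The range and interquartile range are simple differences of the quantiles listed above. The following sketch reproduces them from the reported values, with illustrative variable names.

```python
# Quantiles of `id` copied from the list above.
minimum = 2539
q1 = 9471945
q3 = 29152178.5
maximum = 36487245

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print("Range:", value_range)
print("IQR:", iqr)
```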

Descriptive statistics

Standard deviation: 10983108.39
Coefficient of variation (CV): 0.5775372383
Kurtosis: -1.227748342
Mean: 19017143.24
Median Absolute Deviation (MAD): 9908242
Skewness: -0.09025737546
Sum: 9.298432185 × 10^11
Variance: 1.206286698 × 10^14
Monotonicity: Strictly increasing
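
Several of these descriptive statistics are tied together by standard definitions: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below checks the reported values for `id` against those definitions. It is a consistency check under those textbook definitions, not a reproduction of how the report computed them.

```python
# Values for `id` copied from the descriptive statistics above.
std = 10983108.39
mean = 19017143.24
reported_cv = 0.5775372383
reported_variance = 1.206286698e14

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance as squared standard deviation

print(f"CV: {cv:.10f}")
print(f"Variance: {variance:.9e}")
```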
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
11667455              1      < 0.1%
7851219               1      < 0.1%
33138268              1      < 0.1%
1624665               1      < 0.1%
19387402              1      < 0.1%
18516103              1      < 0.1%
29802895              1      < 0.1%
19983575              1      < 0.1%
22078678              1      < 0.1%
33684693              1      < 0.1%
Other values (48885)  48885  > 99.9%

Value                 Count  Frequency (%)
2539                  1      < 0.1%
2595                  1      < 0.1%
3647                  1      < 0.1%
3831                  1      < 0.1%
5022                  1      < 0.1%

Value                 Count  Frequency (%)
36487245              1      < 0.1%
36485609              1      < 0.1%
36485431              1      < 0.1%
36485057              1      < 0.1%
36484665              1      < 0.1%

name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 47905
Distinct (%): 98.0%
Missing: 16
Missing (%): < 0.1%
Memory size: 382.1 KiB

Hillside Hotel                    18
Home away from home               17
New york Multi-unit building      16
Brooklyn Apartment                12
Loft Suite @ The Box House Hotel  11
Other values (47900)              48805

Length

Max length: 179
Median length: 37
Mean length: 36.91114794
Min length: 1

Characters and Unicode

Total characters: 1804180
Distinct characters: 776
Distinct categories: 20
Distinct scripts: 11
Distinct blocks: 17
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
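
As an illustration of the character properties mentioned here, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for the first sample value of `name`. The helper name `category_counts` is illustrative and is not part of the profiling tool. The two-letter codes correspond to the category names used in this report, for example `Ll` for Lowercase Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator and `Po` for Other Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample row of the `name` variable.
sample = "Clean & quiet apt home by the park"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```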

Unique

Unique: 47260
Unique (%): 96.7%

Sample

1st row: Clean & quiet apt home by the park
2nd row: Skylit Midtown Castle
3rd row: THE VILLAGE OF HARLEM....NEW YORK !
4th row: Cozy Entire Floor of Brownstone
5th row: Entire Apt: Spacious Studio/Loft by central park

Value                                       Count  Frequency (%)
Hillside Hotel                              18     < 0.1%
Home away from home                         17     < 0.1%
New york Multi-unit building                16     < 0.1%
Brooklyn Apartment                          12     < 0.1%
Loft Suite @ The Box House Hotel            11     < 0.1%
Private Room                                11     < 0.1%
Private room                                10     < 0.1%
Artsy Private BR in Fort Greene Cumberland  10     < 0.1%
Cozy Brooklyn Apartment                     8      < 0.1%
Beautiful Brooklyn Brownstone               8      < 0.1%
Other values (47895)                        48758  99.7%
(Missing)                                   16     < 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
in16752
 
5.6%
room10038
 
3.4%
8430
 
2.8%
bedroom7601
 
2.5%
private7158
 
2.4%
apartment6695
 
2.2%
cozy4991
 
1.7%
apt4618
 
1.5%
brooklyn4049
 
1.4%
studio3988
 
1.3%
Other values (12552)224301
75.1%

Most occurring characters

ValueCountFrequency (%)
251424
 
13.9%
e124635
 
6.9%
o122324
 
6.8%
t105261
 
5.8%
a103586
 
5.7%
r97946
 
5.4%
i94651
 
5.2%
n94611
 
5.2%
l51723
 
2.9%
m49121
 
2.7%
Other values (766)708898
39.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1206208
66.9%
Uppercase Letter270574
 
15.0%
Space Separator251428
 
13.9%
Other Punctuation33826
 
1.9%
Decimal Number25321
 
1.4%
Dash Punctuation6878
 
0.4%
Math Symbol2738
 
0.2%
Other Letter2547
 
0.1%
Close Punctuation1537
 
0.1%
Open Punctuation1395
 
0.1%
Other values (10)1728
 
0.1%

Most frequent character per category

ValueCountFrequency (%)
82
 
3.2%
46
 
1.8%
44
 
1.7%
41
 
1.6%
38
 
1.5%
37
 
1.5%
36
 
1.4%
36
 
1.4%
30
 
1.2%
29
 
1.1%
Other values (520)2128
83.5%
ValueCountFrequency (%)
e124635
 
10.3%
o122324
 
10.1%
t105261
 
8.7%
a103586
 
8.6%
r97946
 
8.1%
i94651
 
7.8%
n94611
 
7.8%
l51723
 
4.3%
m49121
 
4.1%
s48092
 
4.0%
Other values (58)314258
26.1%
ValueCountFrequency (%)
266
30.3%
168
19.1%
105
 
11.9%
38
 
4.3%
35
 
4.0%
34
 
3.9%
25
 
2.8%
15
 
1.7%
15
 
1.7%
14
 
1.6%
Other values (50)164
18.7%
ValueCountFrequency (%)
B29965
 
11.1%
S26481
 
9.8%
C20989
 
7.8%
A19424
 
7.2%
R17945
 
6.6%
P14623
 
5.4%
E14350
 
5.3%
L14062
 
5.2%
M11930
 
4.4%
N11701
 
4.3%
Other values (33)89104
32.9%
ValueCountFrequency (%)
,9177
27.1%
!7855
23.2%
/5230
15.5%
.4375
12.9%
&3182
 
9.4%
'1074
 
3.2%
*1021
 
3.0%
:597
 
1.8%
#555
 
1.6%
"294
 
0.9%
Other values (11)466
 
1.4%
ValueCountFrequency (%)
+1382
50.5%
|992
36.2%
~271
 
9.9%
=34
 
1.2%
>25
 
0.9%
<20
 
0.7%
6
 
0.2%
4
 
0.1%
2
 
0.1%
×1
 
< 0.1%
ValueCountFrequency (%)
18661
34.2%
26830
27.0%
32560
 
10.1%
52164
 
8.5%
02115
 
8.4%
41307
 
5.2%
6569
 
2.2%
7450
 
1.8%
8399
 
1.6%
9266
 
1.1%
ValueCountFrequency (%)
(1339
96.0%
[36
 
2.6%
{9
 
0.6%
8
 
0.6%
3
 
0.2%
ValueCountFrequency (%)
)1480
96.3%
]37
 
2.4%
}9
 
0.6%
8
 
0.5%
3
 
0.2%
ValueCountFrequency (%)
-6804
98.9%
47
 
0.7%
26
 
0.4%
1
 
< 0.1%
ValueCountFrequency (%)
^9
56.2%
`4
25.0%
´3
 
18.8%
ValueCountFrequency (%)
21
56.8%
11
29.7%
5
 
13.5%
ValueCountFrequency (%)
251424
> 99.9%
 4
 
< 0.1%
ValueCountFrequency (%)
200
84.0%
38
 
16.0%
ValueCountFrequency (%)
165
92.2%
14
 
7.8%
ValueCountFrequency (%)
_42
97.7%
1
 
2.3%
ValueCountFrequency (%)
40
83.3%
8
 
16.7%
ValueCountFrequency (%)
$94
100.0%
ValueCountFrequency (%)
²9
100.0%
ValueCountFrequency (%)
185
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1476579
81.8%
Common324672
 
18.0%
Han2237
 
0.1%
Cyrillic191
 
< 0.1%
Inherited179
 
< 0.1%
Katakana136
 
< 0.1%
Hiragana70
 
< 0.1%
Hangul70
 
< 0.1%
Hebrew31
 
< 0.1%
Georgian13
 
< 0.1%

Most frequent character per script

ValueCountFrequency (%)
82
 
3.7%
46
 
2.1%
44
 
2.0%
41
 
1.8%
38
 
1.7%
37
 
1.7%
36
 
1.6%
36
 
1.6%
30
 
1.3%
29
 
1.3%
Other values (401)1818
81.3%
ValueCountFrequency (%)
251424
77.4%
,9177
 
2.8%
18661
 
2.7%
!7855
 
2.4%
26830
 
2.1%
-6804
 
2.1%
/5230
 
1.6%
.4375
 
1.3%
&3182
 
1.0%
32560
 
0.8%
Other values (123)18574
 
5.7%
ValueCountFrequency (%)
e124635
 
8.4%
o122324
 
8.3%
t105261
 
7.1%
a103586
 
7.0%
r97946
 
6.6%
i94651
 
6.4%
n94611
 
6.4%
l51723
 
3.5%
m49121
 
3.3%
s48092
 
3.3%
Other values (68)584629
39.6%
ValueCountFrequency (%)
7
 
10.0%
3
 
4.3%
3
 
4.3%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
Other values (38)43
61.4%
ValueCountFrequency (%)
а26
13.6%
о18
 
9.4%
т17
 
8.9%
н15
 
7.9%
е13
 
6.8%
к11
 
5.8%
р11
 
5.8%
м10
 
5.2%
с9
 
4.7%
в9
 
4.7%
Other values (23)52
27.2%
ValueCountFrequency (%)
14
 
10.3%
12
 
8.8%
10
 
7.4%
9
 
6.6%
9
 
6.6%
9
 
6.6%
8
 
5.9%
7
 
5.1%
6
 
4.4%
6
 
4.4%
Other values (22)46
33.8%
ValueCountFrequency (%)
16
22.9%
7
10.0%
7
10.0%
6
 
8.6%
5
 
7.1%
4
 
5.7%
4
 
5.7%
3
 
4.3%
2
 
2.9%
2
 
2.9%
Other values (13)14
20.0%
ValueCountFrequency (%)
י5
16.1%
ו5
16.1%
ב4
12.9%
ר4
12.9%
ע2
 
6.5%
ת2
 
6.5%
ה2
 
6.5%
ד1
 
3.2%
ש1
 
3.2%
ל1
 
3.2%
Other values (4)4
12.9%
ValueCountFrequency (%)
165
92.2%
14
 
7.8%
ValueCountFrequency (%)
13
100.0%
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1799687
99.8%
CJK2237
 
0.1%
Misc Symbols500
 
< 0.1%
None431
 
< 0.1%
Punctuation423
 
< 0.1%
Dingbats320
 
< 0.1%
Cyrillic191
 
< 0.1%
VS179
 
< 0.1%
Hiragana70
 
< 0.1%
Hangul70
 
< 0.1%
Other values (7)72
 
< 0.1%

Most frequent character per block

ValueCountFrequency (%)
251424
 
14.0%
e124635
 
6.9%
o122324
 
6.8%
t105261
 
5.8%
a103586
 
5.8%
r97946
 
5.4%
i94651
 
5.3%
n94611
 
5.3%
l51723
 
2.9%
m49121
 
2.7%
Other values (86)704405
39.1%
ValueCountFrequency (%)
200
47.3%
62
 
14.7%
47
 
11.1%
40
 
9.5%
38
 
9.0%
26
 
6.1%
8
 
1.9%
1
 
0.2%
1
 
0.2%
ValueCountFrequency (%)
35
 
8.1%
à28
 
6.5%
ó24
 
5.6%
21
 
4.9%
é16
 
3.7%
15
 
3.5%
14
 
3.2%
·13
 
3.0%
12
 
2.8%
11
 
2.6%
Other values (70)242
56.1%
ValueCountFrequency (%)
165
92.2%
14
 
7.8%
ValueCountFrequency (%)
266
53.2%
105
 
21.0%
38
 
7.6%
15
 
3.0%
11
 
2.2%
8
 
1.6%
6
 
1.2%
6
 
1.2%
6
 
1.2%
6
 
1.2%
Other values (12)33
 
6.6%
ValueCountFrequency (%)
168
52.5%
34
 
10.6%
25
 
7.8%
15
 
4.7%
14
 
4.4%
11
 
3.4%
8
 
2.5%
6
 
1.9%
5
 
1.6%
4
 
1.2%
Other values (13)30
 
9.4%
ValueCountFrequency (%)
1
100.0%
ValueCountFrequency (%)
82
 
3.7%
46
 
2.1%
44
 
2.0%
41
 
1.8%
38
 
1.7%
37
 
1.7%
36
 
1.6%
36
 
1.6%
30
 
1.3%
29
 
1.3%
Other values (401)1818
81.3%
ValueCountFrequency (%)
16
22.9%
7
10.0%
7
10.0%
6
 
8.6%
5
 
7.1%
4
 
5.7%
4
 
5.7%
3
 
4.3%
2
 
2.9%
2
 
2.9%
Other values (13)14
20.0%
ValueCountFrequency (%)
а26
13.6%
о18
 
9.4%
т17
 
8.9%
н15
 
7.9%
е13
 
6.8%
к11
 
5.8%
р11
 
5.8%
м10
 
5.2%
с9
 
4.7%
в9
 
4.7%
Other values (23)52
27.2%
ValueCountFrequency (%)
י5
16.1%
ו5
16.1%
ב4
12.9%
ר4
12.9%
ע2
 
6.5%
ת2
 
6.5%
ה2
 
6.5%
ד1
 
3.2%
ש1
 
3.2%
ל1
 
3.2%
Other values (4)4
12.9%
ValueCountFrequency (%)
13
100.0%
ValueCountFrequency (%)
7
 
10.0%
3
 
4.3%
3
 
4.3%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
2
 
2.9%
Other values (38)43
61.4%
ValueCountFrequency (%)
4
36.4%
2
18.2%
2
18.2%
2
18.2%
1
 
9.1%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
4
57.1%
2
28.6%
1
 
14.3%
ValueCountFrequency (%)
4
57.1%
1
 
14.3%
1
 
14.3%
1
 
14.3%

host_id
Real number (ℝ≥0)

Distinct: 37457
Distinct (%): 76.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67620010.65
Minimum: 2438
Maximum: 274321313
Zeros: 0
Zeros (%): 0.0%
Memory size: 382.1 KiB

Quantile statistics

Minimum: 2438
5-th percentile: 815564.1
Q1: 7822033
median: 30793816
Q3: 107434423
95-th percentile: 241764600.2
Maximum: 274321313
Range: 274318875
Interquartile range (IQR): 99612390

Descriptive statistics

Standard deviation: 78610967.03
Coefficient of variation (CV): 1.162539998
Kurtosis: 0.1691057556
Mean: 67620010.65
Median Absolute Deviation (MAD): 27543913
Skewness: 1.206213924
Sum: 3.306280421 × 10^12
Variance: 6.179684138 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
219517861             327    0.7%
107434423             232    0.5%
30283594              121    0.2%
137358866             103    0.2%
12243051              96     0.2%
16098958              96     0.2%
61391963              91     0.2%
22541573              87     0.2%
200380610             65     0.1%
7503643               52     0.1%
Other values (37447)  47625  97.4%

Value                 Count  Frequency (%)
2438                  1      < 0.1%
2571                  1      < 0.1%
2787                  6      < 0.1%
2845                  2      < 0.1%
2868                  1      < 0.1%

Value                 Count  Frequency (%)
274321313             1      < 0.1%
274311461             1      < 0.1%
274307600             1      < 0.1%
274298453             1      < 0.1%
274273284             1      < 0.1%

host_name
Categorical

HIGH CARDINALITY

Distinct: 11452
Distinct (%): 23.4%
Missing: 21
Missing (%): < 0.1%
Memory size: 382.1 KiB

Michael               417
David                 403
Sonder (NYC)          327
John                  294
Alex                  279
Other values (11447)  47154

Length

Max length: 35
Median length: 6
Mean length: 6.12487212
Min length: 1

Characters and Unicode

Total characters: 299347
Distinct characters: 204
Distinct categories: 15
Distinct scripts: 7
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
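
The mean length reported for `host_name` appears to be the total character count divided by the number of non-missing values. The sketch below checks that relationship from the figures in this section. The assumption that missing values are excluded is inferred from the arithmetic and is not stated in the report.

```python
# Figures for `host_name` copied from this section and the dataset overview.
total_characters = 299347
n_observations = 48895
n_missing = 21
reported_mean_length = 6.12487212

# Mean length over the non-missing values only (assumed).
non_missing = n_observations - n_missing
mean_length = total_characters / non_missing

print(f"Mean length: {mean_length:.8f}")
```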

Unique

Unique: 6903
Unique (%): 14.1%

Sample

1st row: John
2nd row: Jennifer
3rd row: Elisabeth
4th row: LisaRoxanne
5th row: Laura

Value                 Count  Frequency (%)
Michael               417    0.9%
David                 403    0.8%
Sonder (NYC)          327    0.7%
John                  294    0.6%
Alex                  279    0.6%
Blueground            232    0.5%
Sarah                 227    0.5%
Daniel                226    0.5%
Jessica               205    0.4%
Maria                 204    0.4%
Other values (11442)  46060  94.2%
Histogram of lengths of the category
ValueCountFrequency (%)
1120
 
2.1%
and625
 
1.1%
michael460
 
0.8%
david449
 
0.8%
sonder423
 
0.8%
nyc338
 
0.6%
john337
 
0.6%
alex330
 
0.6%
laura293
 
0.5%
maria244
 
0.4%
Other values (10259)49968
91.5%

Most occurring characters

ValueCountFrequency (%)
a37929
 
12.7%
e28680
 
9.6%
i24284
 
8.1%
n24092
 
8.0%
r17861
 
6.0%
l15327
 
5.1%
o12743
 
4.3%
t9401
 
3.1%
s9147
 
3.1%
h9040
 
3.0%
Other values (194)110843
37.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter235916
78.8%
Uppercase Letter54823
 
18.3%
Space Separator5811
 
1.9%
Other Punctuation1592
 
0.5%
Open Punctuation381
 
0.1%
Close Punctuation379
 
0.1%
Dash Punctuation209
 
0.1%
Other Letter110
 
< 0.1%
Decimal Number84
 
< 0.1%
Math Symbol34
 
< 0.1%
Other values (5)8
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
6
 
5.5%
5
 
4.5%
5
 
4.5%
5
 
4.5%
4
 
3.6%
3
 
2.7%
3
 
2.7%
3
 
2.7%
2
 
1.8%
2
 
1.8%
Other values (62)72
65.5%
ValueCountFrequency (%)
a37929
16.1%
e28680
12.2%
i24284
10.3%
n24092
10.2%
r17861
 
7.6%
l15327
 
6.5%
o12743
 
5.4%
t9401
 
4.0%
s9147
 
3.9%
h9040
 
3.8%
Other values (54)47412
20.1%
ValueCountFrequency (%)
A6458
11.8%
J5458
 
10.0%
M5298
 
9.7%
S4744
 
8.7%
C3737
 
6.8%
L2885
 
5.3%
D2752
 
5.0%
K2618
 
4.8%
R2566
 
4.7%
E2361
 
4.3%
Other values (28)15946
29.1%
ValueCountFrequency (%)
520
23.8%
014
16.7%
714
16.7%
211
13.1%
17
 
8.3%
47
 
8.3%
34
 
4.8%
64
 
4.8%
82
 
2.4%
91
 
1.2%
ValueCountFrequency (%)
&1162
73.0%
.309
 
19.4%
/41
 
2.6%
,35
 
2.2%
'25
 
1.6%
@8
 
0.5%
"6
 
0.4%
!4
 
0.3%
:2
 
0.1%
ValueCountFrequency (%)
5805
99.9%
6
 
0.1%
ValueCountFrequency (%)
+34
100.0%
ValueCountFrequency (%)
(381
100.0%
ValueCountFrequency (%)
)379
100.0%
ValueCountFrequency (%)
-209
100.0%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
£1
100.0%
ValueCountFrequency (%)
_1
100.0%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin290683
97.1%
Common8498
 
2.8%
Han91
 
< 0.1%
Cyrillic56
 
< 0.1%
Hangul11
 
< 0.1%
Hebrew5
 
< 0.1%
Hiragana3
 
< 0.1%

Most frequent character per script

ValueCountFrequency (%)
a37929
 
13.0%
e28680
 
9.9%
i24284
 
8.4%
n24092
 
8.3%
r17861
 
6.1%
l15327
 
5.3%
o12743
 
4.4%
t9401
 
3.2%
s9147
 
3.1%
h9040
 
3.1%
Other values (70)102179
35.2%
ValueCountFrequency (%)
6
 
6.6%
5
 
5.5%
5
 
5.5%
5
 
5.5%
4
 
4.4%
3
 
3.3%
3
 
3.3%
3
 
3.3%
2
 
2.2%
2
 
2.2%
Other values (45)53
58.2%
ValueCountFrequency (%)
5805
68.3%
&1162
 
13.7%
(381
 
4.5%
)379
 
4.5%
.309
 
3.6%
-209
 
2.5%
/41
 
0.5%
,35
 
0.4%
+34
 
0.4%
'25
 
0.3%
Other values (20)118
 
1.4%
ValueCountFrequency (%)
е6
10.7%
н6
10.7%
а6
10.7%
А4
 
7.1%
л4
 
7.1%
и4
 
7.1%
к3
 
5.4%
с3
 
5.4%
й3
 
5.4%
р3
 
5.4%
Other values (12)14
25.0%
ValueCountFrequency (%)
2
18.2%
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
ValueCountFrequency (%)
ד1
20.0%
נ1
20.0%
י1
20.0%
א1
20.0%
ל1
20.0%
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII298922
99.9%
None247
 
0.1%
CJK91
 
< 0.1%
Cyrillic56
 
< 0.1%
Hangul11
 
< 0.1%
Punctuation10
 
< 0.1%
Hebrew5
 
< 0.1%
Hiragana3
 
< 0.1%
Misc Symbols2
 
< 0.1%

Most frequent character per block

ValueCountFrequency (%)
a37929
 
12.7%
e28680
 
9.6%
i24284
 
8.1%
n24092
 
8.1%
r17861
 
6.0%
l15327
 
5.1%
o12743
 
4.3%
t9401
 
3.1%
s9147
 
3.1%
h9040
 
3.0%
Other values (67)110418
36.9%
ValueCountFrequency (%)
é107
43.3%
í24
 
9.7%
á22
 
8.9%
ú19
 
7.7%
ë13
 
5.3%
ô11
 
4.5%
ó9
 
3.6%
è7
 
2.8%
ç5
 
2.0%
ï4
 
1.6%
Other values (19)26
 
10.5%
ValueCountFrequency (%)
6
 
6.6%
5
 
5.5%
5
 
5.5%
5
 
5.5%
4
 
4.4%
3
 
3.3%
3
 
3.3%
3
 
3.3%
2
 
2.2%
2
 
2.2%
Other values (45)53
58.2%
ValueCountFrequency (%)
6
60.0%
2
 
20.0%
2
 
20.0%
ValueCountFrequency (%)
2
18.2%
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
е6
10.7%
н6
10.7%
а6
10.7%
А4
 
7.1%
л4
 
7.1%
и4
 
7.1%
к3
 
5.4%
с3
 
5.4%
й3
 
5.4%
р3
 
5.4%
Other values (12)14
25.0%
ValueCountFrequency (%)
ד1
20.0%
נ1
20.0%
י1
20.0%
א1
20.0%
ל1
20.0%
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

neighbourhood_group
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 382.1 KiB

Manhattan      21661
Brooklyn       20104
Queens         5666
Bronx          1091
Staten Island  373

Length

Max length: 13
Median length: 8
Mean length: 8.182452193
Min length: 5

Characters and Unicode

Total characters: 400081
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Brooklyn
2nd row: Manhattan
3rd row: Manhattan
4th row: Brooklyn
5th row: Manhattan

Value          Count  Frequency (%)
Manhattan      21661  44.3%
Brooklyn       20104  41.1%
Queens         5666   11.6%
Bronx          1091   2.2%
Staten Island  373    0.8%
Histogram of lengths of the category
ValueCountFrequency (%)
manhattan21661
44.0%
brooklyn20104
40.8%
queens5666
 
11.5%
bronx1091
 
2.2%
island373
 
0.8%
staten373
 
0.8%

Most occurring characters

ValueCountFrequency (%)
n70929
17.7%
a65729
16.4%
t44068
11.0%
o41299
10.3%
M21661
 
5.4%
h21661
 
5.4%
B21195
 
5.3%
r21195
 
5.3%
l20477
 
5.1%
k20104
 
5.0%
Other values (10)51763
12.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter350440
87.6%
Uppercase Letter49268
 
12.3%
Space Separator373
 
0.1%

Most frequent character per category

ValueCountFrequency (%)
n70929
20.2%
a65729
18.8%
t44068
12.6%
o41299
11.8%
h21661
 
6.2%
r21195
 
6.0%
l20477
 
5.8%
k20104
 
5.7%
y20104
 
5.7%
e11705
 
3.3%
Other values (4)13169
 
3.8%
ValueCountFrequency (%)
M21661
44.0%
B21195
43.0%
Q5666
 
11.5%
S373
 
0.8%
I373
 
0.8%
ValueCountFrequency (%)
373
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin399708
99.9%
Common373
 
0.1%

Most frequent character per script

ValueCountFrequency (%)
n70929
17.7%
a65729
16.4%
t44068
11.0%
o41299
10.3%
M21661
 
5.4%
h21661
 
5.4%
B21195
 
5.3%
r21195
 
5.3%
l20477
 
5.1%
k20104
 
5.0%
Other values (9)51390
12.9%
ValueCountFrequency (%)
373
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII400081
100.0%

Most frequent character per block

ValueCountFrequency (%)
n70929
17.7%
a65729
16.4%
t44068
11.0%
o41299
10.3%
M21661
 
5.4%
h21661
 
5.4%
B21195
 
5.3%
r21195
 
5.3%
l20477
 
5.1%
k20104
 
5.0%
Other values (10)51763
12.9%

neighbourhood
Categorical

HIGH CARDINALITY

Distinct: 221
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 382.1 KiB

Williamsburg        3920
Bedford-Stuyvesant  3714
Harlem              2658
Bushwick            2465
Upper West Side     1971
Other values (216)  34167

Length

Max length: 26
Median length: 12
Mean length: 11.89479497
Min length: 4

Characters and Unicode

Total characters: 581596
Distinct characters: 54
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): < 0.1%

Sample

1st row: Kensington
2nd row: Midtown
3rd row: Harlem
4th row: Clinton Hill
5th row: East Harlem

Value               Count  Frequency (%)
Williamsburg        3920   8.0%
Bedford-Stuyvesant  3714   7.6%
Harlem              2658   5.4%
Bushwick            2465   5.0%
Upper West Side     1971   4.0%
Hell's Kitchen      1958   4.0%
East Village        1853   3.8%
Upper East Side     1798   3.7%
Crown Heights       1564   3.2%
Midtown             1545   3.2%
Other values (211)  25449  52.0%
Histogram of lengths of the category
ValueCountFrequency (%)
east6592
 
8.3%
side4680
 
5.9%
williamsburg3920
 
5.0%
harlem3775
 
4.8%
upper3769
 
4.8%
bedford-stuyvesant3714
 
4.7%
heights3586
 
4.5%
village3164
 
4.0%
west2759
 
3.5%
bushwick2465
 
3.1%
Other values (233)40681
51.4%

Most occurring characters

ValueCountFrequency (%)
e53470
 
9.2%
i42282
 
7.3%
s39625
 
6.8%
t38587
 
6.6%
a37608
 
6.5%
l34448
 
5.9%
r33667
 
5.8%
30210
 
5.2%
n26099
 
4.5%
o24032
 
4.1%
Other values (44)221568
38.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter461107
79.3%
Uppercase Letter83934
 
14.4%
Space Separator30210
 
5.2%
Dash Punctuation4251
 
0.7%
Other Punctuation2094
 
0.4%

Most frequent character per category

ValueCountFrequency (%)
e53470
11.6%
i42282
 
9.2%
s39625
 
8.6%
t38587
 
8.4%
a37608
 
8.2%
l34448
 
7.5%
r33667
 
7.3%
n26099
 
5.7%
o24032
 
5.2%
d19663
 
4.3%
Other values (15)111626
24.2%
ValueCountFrequency (%)
H11901
14.2%
S11483
13.7%
B8374
10.0%
W8185
9.8%
E7084
8.4%
C5327
 
6.3%
U3833
 
4.6%
G3723
 
4.4%
F3281
 
3.9%
V3209
 
3.8%
Other values (14)17534
20.9%
ValueCountFrequency (%)
'1968
94.0%
.124
 
5.9%
,2
 
0.1%
ValueCountFrequency (%)
30210
100.0%
ValueCountFrequency (%)
-4251
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin545041
93.7%
Common36555
 
6.3%

Most frequent character per script

ValueCountFrequency (%)
e53470
 
9.8%
i42282
 
7.8%
s39625
 
7.3%
t38587
 
7.1%
a37608
 
6.9%
l34448
 
6.3%
r33667
 
6.2%
n26099
 
4.8%
o24032
 
4.4%
d19663
 
3.6%
Other values (39)195560
35.9%
ValueCountFrequency (%)
30210
82.6%
-4251
 
11.6%
'1968
 
5.4%
.124
 
0.3%
,2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII581596
100.0%

Most frequent character per block

ValueCountFrequency (%)
e53470
 
9.2%
i42282
 
7.3%
s39625
 
6.8%
t38587
 
6.6%
a37608
 
6.5%
l34448
 
5.9%
r33667
 
5.8%
30210
 
5.2%
n26099
 
4.5%
o24032
 
4.1%
Other values (44)221568
38.1%

latitude
Real number (ℝ≥0)

Distinct: 19048
Distinct (%): 39.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.72894888
Minimum: 40.49979
Maximum: 40.91306
Zeros: 0
Zeros (%): 0.0%
Memory size: 382.1 KiB

Quantile statistics

Minimum: 40.49979
5-th percentile: 40.646114
Q1: 40.6901
median: 40.72307
Q3: 40.763115
95-th percentile: 40.825643
Maximum: 40.91306
Range: 0.41327
Interquartile range (IQR): 0.073015

Descriptive statistics

Standard deviation: 0.05453007806
Coefficient of variation (CV): 0.001338853065
Kurtosis: 0.1488446574
Mean: 40.72894888
Median Absolute Deviation (MAD): 0.03642
Skewness: 0.2371665585
Sum: 1991441.956
Variance: 0.002973529413
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
40.71813              18     < 0.1%
40.68634              13     < 0.1%
40.69414              13     < 0.1%
40.68444              13     < 0.1%
40.71171              12     < 0.1%
40.68537              12     < 0.1%
40.76189              12     < 0.1%
40.76125              12     < 0.1%
40.71353              12     < 0.1%
40.69054              11     < 0.1%
Other values (19038)  48767  99.7%

Value                 Count  Frequency (%)
40.49979              1      < 0.1%
40.50641              1      < 0.1%
40.50708              1      < 0.1%
40.50868              1      < 0.1%
40.50873              1      < 0.1%

Value                 Count  Frequency (%)
40.91306              1      < 0.1%
40.91234              1      < 0.1%
40.91169              1      < 0.1%
40.91167              1      < 0.1%
40.90804              1      < 0.1%

longitude
Real number (ℝ)

Distinct: 14718
Distinct (%): 30.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.95216961
Minimum: -74.24442
Maximum: -73.71299
Zeros: 0
Zeros (%): 0.0%
Memory size: 382.1 KiB

Quantile statistics

Minimum: -74.24442
5-th percentile: -74.00388
Q1: -73.98307
median: -73.95568
Q3: -73.936275
95-th percentile: -73.865771
Maximum: -73.71299
Range: 0.53143
Interquartile range (IQR): 0.046795

Descriptive statistics

Standard deviation: 0.04615673611
Coefficient of variation (CV): -0.0006241430961
Kurtosis: 5.021646112
Mean: -73.95216961
Median Absolute Deviation (MAD): 0.02485
Skewness: 1.284210209
Sum: -3615891.333
Variance: 0.002130444288
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
-73.95677             18     < 0.1%
-73.95427             18     < 0.1%
-73.95405             17     < 0.1%
-73.95136             16     < 0.1%
-73.94791             16     < 0.1%
-73.9506              16     < 0.1%
-73.95332             16     < 0.1%
-73.95725             15     < 0.1%
-73.98589             15     < 0.1%
-73.95669             15     < 0.1%
Other values (14708)  48733  99.7%

Value                 Count  Frequency (%)
-74.24442             1      < 0.1%
-74.24285             1      < 0.1%
-74.24084             1      < 0.1%
-74.23986             1      < 0.1%
-74.23914             1      < 0.1%

Value                 Count  Frequency (%)
-73.71299             1      < 0.1%
-73.7169              1      < 0.1%
-73.71795             1      < 0.1%
-73.71829             1      < 0.1%
-73.71928             1      < 0.1%

room_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 382.1 KiB

Entire home/apt  25409
Private room     22326
Shared room      1160

Length

Max length: 15
Median length: 15
Mean length: 13.53526945
Min length: 11

Characters and Unicode

Total characters: 661807
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Private room
2nd row: Entire home/apt
3rd row: Private room
4th row: Entire home/apt
5th row: Entire home/apt

Value            Count  Frequency (%)
Entire home/apt  25409  52.0%
Private room     22326  45.7%
Shared room      1160   2.4%
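
The frequency column of a value-count table such as the one for `room_type` is each count divided by the total number of rows. A minimal sketch using the counts reported for this variable, with illustrative variable names:

```python
# Counts for `room_type` copied from this section.
counts = {
    "Entire home/apt": 25409,
    "Private room": 22326,
    "Shared room": 1160,
}

total = sum(counts.values())

# Relative frequency of each value, as a percentage rounded to one decimal.
frequencies = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in frequencies.items():
    print(f"{value}: {pct}%")
```

Because `room_type` has no missing values, the counts sum to the full number of observations (48895).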
Histogram of lengths of the category
ValueCountFrequency (%)
entire25409
26.0%
home/apt25409
26.0%
room23486
24.0%
private22326
22.8%
shared1160
 
1.2%

Most occurring characters

ValueCountFrequency (%)
e74304
11.2%
t73144
11.1%
r72381
10.9%
o72381
10.9%
a48895
 
7.4%
48895
 
7.4%
m48895
 
7.4%
i47735
 
7.2%
h26569
 
4.0%
E25409
 
3.8%
Other values (7)123199
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter538608
81.4%
Uppercase Letter48895
 
7.4%
Space Separator48895
 
7.4%
Other Punctuation25409
 
3.8%

Most frequent character per category

ValueCountFrequency (%)
e74304
13.8%
t73144
13.6%
r72381
13.4%
o72381
13.4%
a48895
9.1%
m48895
9.1%
i47735
8.9%
h26569
 
4.9%
n25409
 
4.7%
p25409
 
4.7%
Other values (2)23486
 
4.4%
ValueCountFrequency (%)
E25409
52.0%
P22326
45.7%
S1160
 
2.4%
ValueCountFrequency (%)
48895
100.0%
ValueCountFrequency (%)
/25409
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin587503
88.8%
Common74304
 
11.2%

Most frequent character per script

ValueCountFrequency (%)
e74304
12.6%
t73144
12.4%
r72381
12.3%
o72381
12.3%
a48895
8.3%
m48895
8.3%
i47735
8.1%
h26569
 
4.5%
E25409
 
4.3%
n25409
 
4.3%
Other values (5)72381
12.3%
ValueCountFrequency (%)
48895
65.8%
/25409
34.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII661807
100.0%

Most frequent character per block

ValueCountFrequency (%)
e74304
11.2%
t73144
11.1%
r72381
10.9%
o72381
10.9%
a48895
 
7.4%
48895
 
7.4%
m48895
 
7.4%
i47735
 
7.2%
h26569
 
4.0%
E25409
 
3.8%
Other values (7)123199
18.6%

price
Real number (ℝ≥0)

Distinct: 674
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 152.7206872
Minimum: 0
Maximum: 10000
Zeros: 11
Zeros (%): < 0.1%
Memory size: 382.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 40
Q1: 69
median: 106
Q3: 175
95-th percentile: 355
Maximum: 10000
Range: 10000
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 240.1541697
Coefficient of variation (CV): 1.572505822
Kurtosis: 585.6728789
Mean: 152.7206872
Median Absolute Deviation (MAD): 46
Skewness: 19.118939
Sum: 7467278
Variance: 57674.02525
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
100                 2051   4.2%
150                 2047   4.2%
50                  1534   3.1%
60                  1458   3.0%
200                 1401   2.9%
75                  1370   2.8%
80                  1272   2.6%
65                  1190   2.4%
70                  1170   2.4%
120                 1130   2.3%
Other values (664)  34272  70.1%

Value               Count  Frequency (%)
0                   11     < 0.1%
10                  17     < 0.1%
11                  3      < 0.1%
12                  4      < 0.1%
13                  1      < 0.1%

Value               Count  Frequency (%)
10000               3      < 0.1%
9999                3      < 0.1%
8500                1      < 0.1%
8000                1      < 0.1%
7703                1      < 0.1%

minimum_nights
Real number (ℝ≥0)

SKEWED

Distinct: 109
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.029962164
Minimum: 1
Maximum: 1250
Zeros: 0
Zeros (%): 0.0%
Memory size: 382.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 3
Q3: 5
95-th percentile: 30
Maximum: 1250
Range: 1249
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 20.51054953
Coefficient of variation (CV): 2.917590316
Kurtosis: 854.0716624
Mean: 7.029962164
Median Absolute Deviation (MAD): 2
Skewness: 21.82727453
Sum: 343730
Variance: 420.6826422
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
1                  12720  26.0%
2                  11696  23.9%
3                  7999   16.4%
30                 3760   7.7%
4                  3303   6.8%
5                  3034   6.2%
7                  2058   4.2%
6                  752    1.5%
14                 562    1.1%
10                 483    1.0%
Other values (99)  2528   5.2%

Value              Count  Frequency (%)
1                  12720  26.0%
2                  11696  23.9%
3                  7999   16.4%
4                  3303   6.8%
5                  3034   6.2%

Value              Count  Frequency (%)
1250               1      < 0.1%
1000               1      < 0.1%
999                3      < 0.1%
500                5      < 0.1%
480                1      < 0.1%

number_of_reviews
Real number (ℝ≥0)

ZEROS

Distinct: 394
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.27446569
Minimum: 0
Maximum: 629
Zeros: 10052
Zeros (%): 20.6%
Memory size: 382.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 5
Q3: 24
95-th percentile: 114
Maximum: 629
Range: 629
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 44.55058227
Coefficient of variation (CV): 1.91413985
Kurtosis: 19.52978807
Mean: 23.27446569
Median Absolute Deviation (MAD): 5
Skewness: 3.690634572
Sum: 1138005
Variance: 1984.75438
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   10052  20.6%
1                   5244   10.7%
2                   3465   7.1%
3                   2520   5.2%
4                   1994   4.1%
5                   1618   3.3%
6                   1357   2.8%
7                   1179   2.4%
8                   1127   2.3%
9                   964    2.0%
Other values (384)  19375  39.6%

Value               Count  Frequency (%)
0                   10052  20.6%
1                   5244   10.7%
2                   3465   7.1%
3                   2520   5.2%
4                   1994   4.1%

Value               Count  Frequency (%)
629                 1      < 0.1%
607                 1      < 0.1%
597                 1      < 0.1%
594                 1      < 0.1%
576                 1      < 0.1%

last_review
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1764
Distinct (%): 4.5%
Missing: 10052
Missing (%): 20.6%
Memory size: 382.1 KiB

2019-06-23           1413
2019-07-01           1359
2019-06-30           1341
2019-06-24           875
2019-07-07           718
Other values (1759)  33137

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 388430
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 236
Unique (%): 0.6%

Sample

1st row: 2018-10-19
2nd row: 2019-05-21
3rd row: 2019-07-05
4th row: 2018-11-19
5th row: 2019-06-22

Value                Count  Frequency (%)
2019-06-23           1413   2.9%
2019-07-01           1359   2.8%
2019-06-30           1341   2.7%
2019-06-24           875    1.8%
2019-07-07           718    1.5%
2019-07-02           658    1.3%
2019-06-22           655    1.3%
2019-06-16           601    1.2%
2019-07-05           580    1.2%
2019-07-06           565    1.2%
Other values (1754)  30078  61.5%
(Missing)            10052  20.6%
Histogram of lengths of the category
Value | Count | Frequency (%)
2019-06-23 | 1413 | 3.6%
2019-07-01 | 1359 | 3.5%
2019-06-30 | 1341 | 3.5%
2019-06-24 | 875 | 2.3%
2019-07-07 | 718 | 1.8%
2019-07-02 | 658 | 1.7%
2019-06-22 | 655 | 1.7%
2019-06-16 | 601 | 1.5%
2019-07-05 | 580 | 1.5%
2019-07-06 | 565 | 1.5%
Other values (1754) | 30078 | 77.4%
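
The two frequency tables for last_review show different percentages for the same counts. The figures are consistent with the first table dividing by all observations and the second dividing by non-missing observations only. A minimal sketch of that arithmetic follows; the function name frequency_pct is illustrative and the choice of denominators is an assumption inferred from the numbers, not taken from the profiling tool's source.

```python
def frequency_pct(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)

n_observations = 48895                   # all rows, from the dataset overview
n_missing = 10052                        # missing last_review values
n_present = n_observations - n_missing   # rows that have a last_review value

count = 1413  # occurrences of 2019-06-23

# Assumed denominators: all rows for the first table, non-missing rows for the second.
pct_of_all = frequency_pct(count, n_observations)
pct_of_present = frequency_pct(count, n_present)
print(pct_of_all, pct_of_present)
```

Which denominator is appropriate depends on the question: the share of all listings, or the share of listings that have at least one review.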

Most occurring characters

Value | Count | Frequency (%)
0 | 92333 | 23.8%
- | 77686 | 20.0%
1 | 62027 | 16.0%
2 | 58684 | 15.1%
9 | 30106 | 7.8%
6 | 19890 | 5.1%
7 | 12824 | 3.3%
8 | 10838 | 2.8%
5 | 9577 | 2.5%
3 | 8764 | 2.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 310744 | 80.0%
Dash Punctuation | 77686 | 20.0%

Most frequent character per category

Value | Count | Frequency (%)
0 | 92333 | 29.7%
1 | 62027 | 20.0%
2 | 58684 | 18.9%
9 | 30106 | 9.7%
6 | 19890 | 6.4%
7 | 12824 | 4.1%
8 | 10838 | 3.5%
5 | 9577 | 3.1%
3 | 8764 | 2.8%
4 | 5701 | 1.8%

Value | Count | Frequency (%)
- | 77686 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 388430 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
0 | 92333 | 23.8%
- | 77686 | 20.0%
1 | 62027 | 16.0%
2 | 58684 | 15.1%
9 | 30106 | 7.8%
6 | 19890 | 5.1%
7 | 12824 | 3.3%
8 | 10838 | 2.8%
5 | 9577 | 2.5%
3 | 8764 | 2.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 388430 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
0 | 92333 | 23.8%
- | 77686 | 20.0%
1 | 62027 | 16.0%
2 | 58684 | 15.1%
9 | 30106 | 7.8%
6 | 19890 | 5.1%
7 | 12824 | 3.3%
8 | 10838 | 2.8%
5 | 9577 | 2.5%
3 | 8764 | 2.3%

reviews_per_month
Real number (ℝ≥0)

MISSING

Distinct: 937
Distinct (%): 2.4%
Missing: 10052
Missing (%): 20.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.37322143
Minimum: 0.01
Maximum: 58.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 382.1 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.04
Q1: 0.19
median: 0.72
Q3: 2.02
95-th percentile: 4.64
Maximum: 58.5
Range: 58.49
Interquartile range (IQR): 1.83

Descriptive statistics

Standard deviation: 1.680441995
Coefficient of variation (CV): 1.223722525
Kurtosis: 42.49346948
Mean: 1.37322143
Median Absolute Deviation (MAD): 0.62
Skewness: 3.130188536
Sum: 53340.04
Variance: 2.823885299
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.02 | 919 | 1.9%
0.05 | 893 | 1.8%
1 | 893 | 1.8%
0.03 | 804 | 1.6%
0.16 | 667 | 1.4%
0.04 | 655 | 1.3%
0.08 | 596 | 1.2%
0.09 | 593 | 1.2%
0.06 | 579 | 1.2%
0.11 | 539 | 1.1%
Other values (927) | 31705 | 64.8%
(Missing) | 10052 | 20.6%

Value | Count | Frequency (%)
0.01 | 42 | 0.1%
0.02 | 919 | 1.9%
0.03 | 804 | 1.6%
0.04 | 655 | 1.3%
0.05 | 893 | 1.8%

Value | Count | Frequency (%)
58.5 | 1 | < 0.1%
27.95 | 1 | < 0.1%
20.94 | 1 | < 0.1%
19.75 | 1 | < 0.1%
17.82 | 1 | < 0.1%
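
Several of the reviews_per_month summary figures can be derived from one another, which gives a quick consistency check on the report. The sketch below recomputes them from the values printed above. It assumes the mean is taken over non-missing rows only, and the variable names are illustrative.

```python
# Values copied from the reviews_per_month summary.
minimum, maximum = 0.01, 58.5
q1, q3 = 0.19, 2.02
mean, std = 1.37322143, 1.680441995
total = 53340.04               # reported Sum
n_present = 48895 - 10052      # observations minus missing values (assumed denominator)

value_range = maximum - minimum     # Range
iqr = q3 - q1                       # Interquartile range (IQR)
cv = std / mean                     # Coefficient of variation (CV)
variance = std ** 2                 # Variance
mean_from_sum = total / n_present   # Mean recomputed from Sum

print(round(value_range, 2), round(iqr, 2), round(cv, 6),
      round(variance, 6), round(mean_from_sum, 6))
```

Small differences in the last digits are expected because the inputs are already rounded in the report.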

calculated_host_listings_count
Real number (ℝ≥0)

Distinct: 47
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.143982002
Minimum: 1
Maximum: 327
Zeros: 0
Zeros (%): 0.0%
Memory size: 382.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 2
95-th percentile: 15
Maximum: 327
Range: 326
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 32.95251885
Coefficient of variation (CV): 4.612626241
Kurtosis: 67.5508883
Mean: 7.143982002
Median Absolute Deviation (MAD): 0
Skewness: 7.9331739
Sum: 349305
Variance: 1085.868499
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Value | Count | Frequency (%)
1 | 32303 | 66.1%
2 | 6658 | 13.6%
3 | 2853 | 5.8%
4 | 1440 | 2.9%
5 | 845 | 1.7%
6 | 570 | 1.2%
8 | 416 | 0.9%
7 | 399 | 0.8%
327 | 327 | 0.7%
9 | 234 | 0.5%
Other values (37) | 2850 | 5.8%

Value | Count | Frequency (%)
1 | 32303 | 66.1%
2 | 6658 | 13.6%
3 | 2853 | 5.8%
4 | 1440 | 2.9%
5 | 845 | 1.7%

Value | Count | Frequency (%)
327 | 327 | 0.7%
232 | 232 | 0.5%
121 | 121 | 0.2%
103 | 103 | 0.2%
96 | 192 | 0.4%

availability_365
Real number (ℝ≥0)

ZEROS

Distinct: 366
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.7813273
Minimum: 0
Maximum: 365
Zeros: 17533
Zeros (%): 35.9%
Memory size: 382.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 45
Q3: 227
95-th percentile: 359
Maximum: 365
Range: 365
Interquartile range (IQR): 227

Descriptive statistics

Standard deviation: 131.6222889
Coefficient of variation (CV): 1.167057455
Kurtosis: -0.9975340452
Mean: 112.7813273
Median Absolute Deviation (MAD): 45
Skewness: 0.7634075771
Sum: 5514443
Variance: 17324.42692
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 17533 | 35.9%
365 | 1295 | 2.6%
364 | 491 | 1.0%
1 | 408 | 0.8%
89 | 361 | 0.7%
5 | 340 | 0.7%
3 | 306 | 0.6%
179 | 301 | 0.6%
90 | 290 | 0.6%
2 | 270 | 0.6%
Other values (356) | 27300 | 55.8%

Value | Count | Frequency (%)
0 | 17533 | 35.9%
1 | 408 | 0.8%
2 | 270 | 0.6%
3 | 306 | 0.6%
4 | 233 | 0.5%

Value | Count | Frequency (%)
365 | 1295 | 2.6%
364 | 491 | 1.0%
363 | 239 | 0.5%
362 | 166 | 0.3%
361 | 111 | 0.2%

Interactions

[Interaction plots (90 images) not reproduced in text.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
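
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the profiling tool runs; the function name pearson_r and the toy data are assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)

# Toy data: shifting and positively rescaling either variable leaves r unchanged.
x = [1, 2, 4, 7]
y = [3, 1, 6, 5]
r_original = pearson_r(x, y)
r_transformed = pearson_r([10 * v + 3 for v in x], [0.5 * v - 2 for v in y])
print(r_original, r_transformed)
```

Note that a negative scale factor on one variable flips the sign of r while keeping its magnitude.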

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
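
A minimal sketch of this rank-then-correlate procedure, in plain Python. Tied values receive the average of the ranks they span, which is a common convention and an assumption here; the function names and toy data are illustrative.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation computed on the rank variables."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(ss_x * ss_y)

# Toy data: y = x**3 is monotonic but not linear, so rho is 1 up to rounding.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering of the values enters the calculation, any strictly increasing transformation of either variable leaves ρ unchanged.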

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
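
The pair-counting definition above can be written directly in plain Python. This sketch implements the formula literally (often called tau-a); statistical libraries commonly report a tie-corrected variant instead, so results can differ when the data contain ties. The function name and toy data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 is a tie: it counts toward the total but toward neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data: one of the three pairs is discordant, so tau = (2 - 1) / 3.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

The double loop over pairs is quadratic in the number of observations, which is fine for a sketch but slow on large columns.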

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
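
As an illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table of counts, using the correction as it is commonly stated: the squared phi coefficient and the table dimensions are each adjusted by a term in 1/(n - 1). It is an assumption that this matches the report's implementation in every detail; the function name and toy tables are illustrative.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, k = len(row_totals), len(col_totals)
    if n < 2:
        raise ValueError("need at least 2 observations")

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi2 and the effective numbers of rows and columns.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # a single row or column carries no association
    return math.sqrt(phi2_corr / denom)

# Toy tables: independent counts give 0, a perfectly diagonal table gives 1 (up to rounding).
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(v_independent, v_perfect)
```

The max(0, ...) clamp keeps the corrected value non-negative when the observed association is weaker than the expected bias.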

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
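
Nullity correlation can be illustrated with a small sketch. It assumes the measure is the Pearson correlation between per-column presence indicators (1 if a value is present, 0 if missing), which is a common way to compute it; the function names and the five toy rows are illustrative and are not the full dataset.

```python
import math

def presence(column):
    """Map each value to 1 if present, 0 if missing (None)."""
    return [0 if value is None else 1 for value in column]

def nullity_correlation(col_a, col_b):
    """Pearson correlation of the presence indicators of two columns."""
    a, b = presence(col_a), presence(col_b)
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    ss_a = sum((u - mean_a) ** 2 for u in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    if ss_a == 0 or ss_b == 0:
        return None  # undefined when a column is fully present or fully missing
    return cov / math.sqrt(ss_a * ss_b)

# Toy rows: last_review and reviews_per_month are missing together.
last_review = ["2018-10-19", "2019-05-21", None, "2019-07-05", None]
reviews_per_month = [0.21, 0.38, None, 4.64, None]
price = [149, 225, 150, 89, 80]

paired = nullity_correlation(last_review, reviews_per_month)
with_complete_column = nullity_correlation(last_review, price)
print(paired, with_complete_column)
```

In this report, last_review and reviews_per_month both have 10052 missing values, which is consistent with the two columns being missing on the same rows, although equal counts alone do not prove it.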

Sample

First rows

(index) | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365
0 | 2539 | Clean & quiet apt home by the park | 2787 | John | Brooklyn | Kensington | 40.64749 | -73.97237 | Private room | 149 | 1 | 9 | 2018-10-19 | 0.21 | 6 | 365
1 | 2595 | Skylit Midtown Castle | 2845 | Jennifer | Manhattan | Midtown | 40.75362 | -73.98377 | Entire home/apt | 225 | 1 | 45 | 2019-05-21 | 0.38 | 2 | 355
2 | 3647 | THE VILLAGE OF HARLEM....NEW YORK ! | 4632 | Elisabeth | Manhattan | Harlem | 40.80902 | -73.94190 | Private room | 150 | 3 | 0 | NaN | NaN | 1 | 365
3 | 3831 | Cozy Entire Floor of Brownstone | 4869 | LisaRoxanne | Brooklyn | Clinton Hill | 40.68514 | -73.95976 | Entire home/apt | 89 | 1 | 270 | 2019-07-05 | 4.64 | 1 | 194
4 | 5022 | Entire Apt: Spacious Studio/Loft by central park | 7192 | Laura | Manhattan | East Harlem | 40.79851 | -73.94399 | Entire home/apt | 80 | 10 | 9 | 2018-11-19 | 0.10 | 1 | 0
5 | 5099 | Large Cozy 1 BR Apartment In Midtown East | 7322 | Chris | Manhattan | Murray Hill | 40.74767 | -73.97500 | Entire home/apt | 200 | 3 | 74 | 2019-06-22 | 0.59 | 1 | 129
6 | 5121 | BlissArtsSpace! | 7356 | Garon | Brooklyn | Bedford-Stuyvesant | 40.68688 | -73.95596 | Private room | 60 | 45 | 49 | 2017-10-05 | 0.40 | 1 | 0
7 | 5178 | Large Furnished Room Near B'way | 8967 | Shunichi | Manhattan | Hell's Kitchen | 40.76489 | -73.98493 | Private room | 79 | 2 | 430 | 2019-06-24 | 3.47 | 1 | 220
8 | 5203 | Cozy Clean Guest Room - Family Apt | 7490 | MaryEllen | Manhattan | Upper West Side | 40.80178 | -73.96723 | Private room | 79 | 2 | 118 | 2017-07-21 | 0.99 | 1 | 0
9 | 5238 | Cute & Cozy Lower East Side 1 bdrm | 7549 | Ben | Manhattan | Chinatown | 40.71344 | -73.99037 | Entire home/apt | 150 | 1 | 160 | 2019-06-09 | 1.33 | 4 | 188

Last rows

(index) | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365
48885 | 36482809 | Stunning Bedroom NYC! Walking to Central Park!! | 131529729 | Kendall | Manhattan | East Harlem | 40.79633 | -73.93605 | Private room | 75 | 2 | 0 | NaN | NaN | 2 | 353
48886 | 36483010 | Comfy 1 Bedroom in Midtown East | 274311461 | Scott | Manhattan | Midtown | 40.75561 | -73.96723 | Entire home/apt | 200 | 6 | 0 | NaN | NaN | 1 | 176
48887 | 36483152 | Garden Jewel Apartment in Williamsburg New York | 208514239 | Melki | Brooklyn | Williamsburg | 40.71232 | -73.94220 | Entire home/apt | 170 | 1 | 0 | NaN | NaN | 3 | 365
48888 | 36484087 | Spacious Room w/ Private Rooftop, Central location | 274321313 | Kat | Manhattan | Hell's Kitchen | 40.76392 | -73.99183 | Private room | 125 | 4 | 0 | NaN | NaN | 1 | 31
48889 | 36484363 | QUIT PRIVATE HOUSE | 107716952 | Michael | Queens | Jamaica | 40.69137 | -73.80844 | Private room | 65 | 1 | 0 | NaN | NaN | 2 | 163
48890 | 36484665 | Charming one bedroom - newly renovated rowhouse | 8232441 | Sabrina | Brooklyn | Bedford-Stuyvesant | 40.67853 | -73.94995 | Private room | 70 | 2 | 0 | NaN | NaN | 2 | 9
48891 | 36485057 | Affordable room in Bushwick/East Williamsburg | 6570630 | Marisol | Brooklyn | Bushwick | 40.70184 | -73.93317 | Private room | 40 | 4 | 0 | NaN | NaN | 2 | 36
48892 | 36485431 | Sunny Studio at Historical Neighborhood | 23492952 | Ilgar & Aysel | Manhattan | Harlem | 40.81475 | -73.94867 | Entire home/apt | 115 | 10 | 0 | NaN | NaN | 1 | 27
48893 | 36485609 | 43rd St. Time Square-cozy single bed | 30985759 | Taz | Manhattan | Hell's Kitchen | 40.75751 | -73.99112 | Shared room | 55 | 1 | 0 | NaN | NaN | 6 | 2
48894 | 36487245 | Trendy duplex in the very heart of Hell's Kitchen | 68119814 | Christophe | Manhattan | Hell's Kitchen | 40.76404 | -73.98933 | Private room | 90 | 7 | 0 | NaN | NaN | 1 | 23